multiClust: An R-package for Identifying Biologically Relevant Clusters in Cancer Transcriptome Profiles

Authors

  • Nathan Lawlor
  • Alec Fabbri
  • Peiyong Guan
  • Joshy George
  • R. Krishna Murthy Karuturi

Abstract

Clustering is used to identify patterns in transcriptomic profiles and thereby determine clinically relevant subgroups of patients. Feature (gene) selection is a critical and integral part of this process. Many feature selection and clustering methods exist for identifying relevant genes and clustering samples, but choosing an appropriate methodology is difficult. In addition, the available packages do not support an extensive range of feature selection methods. We therefore developed an integrative R package, multiClust, that lets researchers easily experiment with different combinations of gene selection and clustering methods. Using multiClust, we identified the best-performing clustering methodology in the context of clinical outcome. Our observations demonstrate that simple methods such as variance-based ranking perform well on the majority of data sets, provided that an appropriate number of genes is selected. However, other gene ranking and selection methods remain relevant, as no single methodology works for all studies.
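
To make the idea of variance-based ranking concrete, the following is a minimal Python sketch of the general workflow the abstract describes: rank genes by variance across samples, keep the top few, then cluster the samples on the selected genes. It does not use or reproduce the multiClust API (multiClust is an R package). All function names, gene names, and expression values here are hypothetical, and the k-means routine with naive first-k initialization is an assumed stand-in for whichever clustering method a study would actually choose.

```python
from statistics import pvariance


def rank_genes_by_variance(expr):
    """Return gene names sorted by decreasing variance across samples.

    expr: dict mapping gene name -> list of expression values (one per sample).
    """
    return sorted(expr, key=lambda gene: pvariance(expr[gene]), reverse=True)


def select_top_genes(expr, n_genes):
    """Keep only the n_genes most variable genes (hypothetical helper)."""
    top = rank_genes_by_variance(expr)[:n_genes]
    return {gene: expr[gene] for gene in top}


def sample_profiles(expr):
    """Transpose gene -> values into one expression vector per sample."""
    genes = list(expr)
    n_samples = len(expr[genes[0]])
    return [[expr[g][j] for g in genes] for j in range(n_samples)]


def kmeans(points, k, n_iter=20):
    """Plain k-means, seeded with the first k points (assumption: kept
    deterministic for illustration; real analyses use better seeding)."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each sample to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its assigned samples.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy data (made up): 4 genes measured in 6 samples.
toy_expr = {
    "GENE_A": [1.0, 1.1, 0.9, 5.0, 5.2, 4.9],  # varies strongly between groups
    "GENE_B": [2.0, 2.0, 2.1, 2.0, 1.9, 2.0],  # nearly constant
    "GENE_C": [0.5, 0.4, 0.6, 3.0, 3.1, 2.9],  # varies strongly between groups
    "GENE_D": [7.0, 7.1, 7.0, 6.9, 7.0, 7.1],  # nearly constant
}

selected = select_top_genes(toy_expr, n_genes=2)
labels = kmeans(sample_profiles(selected), k=2)
print(list(selected), labels)
```

In this toy setting the number of genes kept (`n_genes`) is the tuning choice the abstract highlights: too few genes may miss informative signal, while too many dilute it with near-constant genes.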

Similar resources

Author's response to reviews: The Colorectal cancer disease-specific transcriptome may facilitate the discovery of more biologically and clinically relevant information

Author's response to reviews: The Colorectal cancer disease-specific transcriptome facilitates the discovery of more biologically and clinically relevant information

Efficient methods for identifying mutated driver pathways in cancer

MOTIVATION The first step for clinical diagnostics, prognostics and targeted therapeutics of cancer is to comprehensively understand its molecular mechanisms. Large-scale cancer genomics projects are providing a large volume of data about genomic, epigenomic and gene expression aberrations in multiple cancer types. One of the remaining challenges is to identify driver mutations, driver genes an...

Inferring cluster-based networks from differently stimulated multiple time-course gene expression data

MOTIVATION Clustering and gene network inference often help to predict the biological functions of gene subsets. Recently, researchers have accumulated a large amount of time-course transcriptome data collected under different treatment conditions to understand the physiological states of cells in response to extracellular stimuli and to identify drug-responsive genes. Although a variety of sta...

iBBiG: iterative binary bi-clustering of gene sets

MOTIVATION Meta-analysis of genomics data seeks to identify genes associated with a biological phenotype across multiple datasets; however, merging data from different platforms by their features (genes) is challenging. Meta-analysis using functionally or biologically characterized gene sets simplifies data integration, is biologically intuitive and is seen as having great potential, but is an e...

Volume 15

Publication date: 2016